INTRODUCTION

This project explores an innovative CAD strategy for improving early detection of breast
cancer in screening mammograms by focusing on computerized analysis and detection of cancers
missed by radiologists. Because of unforeseen difficulties in data collection, the first year of
research fell behind schedule. Given the limited time and budget remaining and, more importantly,
the research results obtained so far, two steps were taken in the past year. First, we requested a
revision of the Statement of Work to focus on the most important research items, which has been
approved by DoD. Second, we worked to catch up with the schedule. Substantial progress was made
in the second year of research.

BODY 

Objective 1: to determine the effect of density pattern on cancer detection
Accomplishments: 

(1) Segmentation of glandular regions in mammograms

An automatic, statistics-based method developed in our lab [1] was applied to segment dense
tissue in the mammograms. Segmentation was performed on both cancerous and normal mammograms
at the screening-detected and screening-missed stages. The percentage of the whole breast area
occupied by segmented dense tissue is used as the index of breast density. Figure 1 shows the
histograms of the density index for the different types of mammograms. To check the correlation
of density between mammograms at the missed and detected stages, two kinds of correlation
analysis were performed: Pearson's correlation and Spearman's rank correlation [2]. The Pearson
correlation coefficient measures the strength and direction of a linear relationship between two
variables. It has two limitations: it is strongly affected by outliers in the data, and it
measures only linear relationships between variables. Spearman's rank correlation coefficient is
a nonparametric (distribution-free) rank statistic that measures the strength of the association
between two variables. Because it depends only on ranks, it is robust to outliers. The
correlation coefficients are listed in Table 1. It is observed that (i) the Pearson and Spearman
coefficients agree closely, i.e. no significant outliers exist in the density segmentation;
(ii) the breast density segmented at the missed stage is correlated with that at the detected
stage; (iii) the correlation between the missed and detected stages is higher for normal
mammograms than for cancerous mammograms. One explanation is that cancerous mammograms usually
have a more complicated density pattern and are statistically of higher density, as shown below,
which leads to larger variations in segmentation.
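The density index and the two correlation measures described above can be illustrated with a minimal Python sketch. It is not the segmentation method of [1] or the statistical software used in the study; the function names, the binary-mask input format, and the average-rank handling of ties are assumptions made for illustration only.

```python
from math import sqrt


def density_index(dense_mask, breast_mask):
    """Percentage of breast pixels labeled as dense tissue.

    Assumption: both masks are 2-D lists of 0/1 values with the same shape.
    """
    breast = sum(sum(row) for row in breast_mask)
    if breast == 0:
        raise ValueError("breast mask is empty")
    dense = sum(
        1
        for d_row, b_row in zip(dense_mask, breast_mask)
        for d, b in zip(d_row, b_row)
        if d and b
    )
    return 100.0 * dense / breast


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

With one extreme value in otherwise monotone data, `spearman` stays at 1 while `pearson` drops, which is the outlier sensitivity discussed above.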


Table 1. Correlation of Density Segmentation.

Variable 1       Variable 2        Pearson's correlation    Spearman's correlation
                                   coefficient              coefficient
Missed_cancer    Detected_cancer   0.5896                   0.5946
Missed_normal    Detected_normal   0.6908                   0.6882

[Figure 1 plots: density histograms, with panel titles including "Missed-Cancer" and "Detected-Cancer"]

Figure 1. Histograms of breast density: (a) cancerous mammogram at missed stage; (b) cancerous mammogram at
detected stage; (c) normal mammogram at missed stage; (d) normal mammogram at detected stage.


(2) Density analysis of normal and cancerous mammograms

A set of statistical tests was performed to examine two questions: (i) Is there any difference
in density between the mammograms at the detected stage and those at the missed stage? (ii) Is
there any difference in density between the normal mammograms and the cancerous mammograms?
Table 2 lists the p-values of the t-test and the Wilcoxon rank test for the density difference
between detected-stage and missed-stage mammograms, and between normal and cancerous mammograms,
respectively. If the difference in density index is normally distributed, the t-test is used;
otherwise the Wilcoxon rank test is used. If the p-value is less than 0.05, there is evidence to
reject the null hypothesis that the mean of the difference is zero at the 0.05 significance
level, i.e. the densities are significantly different [2]. It is observed that (i) there is no
significant change in density between the detected and missed stages for either the normal or
the cancerous mammograms. This is because most of the mammograms at the missed and detected
stages were taken in consecutive years, as shown in Figure 2, during which no significant change
in the breast is likely to have occurred. (ii) There is a significant difference in density
between normal and cancerous mammograms at both the detected and missed stages. Specifically,
the cancerous mammograms have a higher density than the normal mammograms.
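As an illustration of the paired tests described above, the following Python sketch computes a paired t statistic and a Wilcoxon signed-rank test. It is not the software used in the study. The function names are hypothetical, the Wilcoxon p-value uses a large-sample normal approximation without tie or continuity correction (an assumption), and the t statistic is returned with its degrees of freedom rather than a p-value, because the standard library has no Student t distribution.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def paired_t_statistic(a, b):
    """Paired t statistic and degrees of freedom for H0: mean difference = 0."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    if n < 2:
        raise ValueError("need at least two pairs")
    return mean(d) / (stdev(d) / sqrt(n)), n - 1


def wilcoxon_signed_rank(a, b):
    """Wilcoxon signed-rank statistic W+ and a two-sided approximate p-value.

    Assumption: normal approximation, no tie or continuity correction.
    """
    d = [x - y for x, y in zip(a, b) if x != y]  # zero differences are dropped
    n = len(d)
    if n == 0:
        raise ValueError("all differences are zero")
    # Rank the absolute differences, averaging ranks over ties.
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    w_plus = sum(r for r, v in zip(ranks, d) if v > 0)
    mu = n * (n + 1) / 4
    sigma = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mu) / sigma
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return w_plus, p
```

A p-value below 0.05 from either test would correspond to rejecting the null hypothesis of zero mean difference, as in the decision rule above.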


Table 2. Statistical Test of Density Difference

Variable 1        Variable 2        t-test p-value    Wilcoxon test p-value
Missed_cancer     Detected_cancer   0.4793            0.5919
Missed_normal     Detected_normal   0.6708            0.5326
Missed_cancer     Missed_normal     5.977e-07         3.339e-06
Detected_cancer   Detected_normal   2.579e-06         5.067e-06

[Figure 2 plot: horizontal axis "Years of Interval"]

Figure 2. Distribution of the interval between mammograms taken at missed and detected stages.

(3) Effect of density pattern on CAD detection performance

The study described above demonstrated a statistical difference in breast density between the
normal and cancerous mammograms. It has also been reported that lesions occurring in dense
breasts are statistically more likely to be missed in screening mammography [3]. However, there
are few reports on the effect of density on CAD detection performance, especially at different
detection stages. In this research, as a baseline study, we tested our existing CAD algorithm on
the serial database in order to examine the differences in detection performance for cases with
different breast density. Detailed technical information on the CAD algorithm can be found in
[4][5].

Wolfe applied a method of classifying parenchymal patterns that used qualitative as well as
quantitative criteria. He described a four-category image classification method based on the
amounts of density and duct work present, designated N1, P1, P2, and DY [6]. The N1 category
refers to parenchyma composed primarily of fat with, at most, small amounts of dysplasia and no
visible ducts. P1 corresponds to the presence of prominent duct work that occupies up to
one-quarter of the breast volume. P2 refers to severe involvement, with a prominent ductal
pattern occupying more than one-quarter of the breast volume. DY refers to mostly dense tissue.
Because of the limited size of the database, and in order to obtain statistically significant
results, we classified the mammograms into two categories with density percentages of less or
more than 25%, which roughly correspond to categories (N1, P1) and (P2, DY) in Wolfe's
classification. Figures 3 and 4 show the FROC curves of the CAD detection results for high
(>25%) and low (<25%) density cases at the detected and missed stages respectively. Note that
sensitivity is defined here as the hit rate per image. If a detection were instead counted
whenever the lesion is marked by CAD on one or both mammographic views, the criterion used in
most commercial CAD system evaluation reports, a much higher sensitivity could be expected at
the same false positive rates (per image). It is observed that (i) the detection performance on
low-density cases is better than that on high-density cases. In other words, as with
radiologists in mammogram screening, lesions occurring in dense breasts are more likely to be
missed in CAD detection; (ii) the difference in detection performance between high- and
low-density cases is smaller at the detected stage than at the missed stage, i.e. at the missed
stage the lesions on dense mammograms are even more difficult to detect relative to the lesions
on low-density mammograms.
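The per-image FROC evaluation and the 25% density split described above can be sketched as follows. This is an illustrative sketch, not the evaluation code used in the study. The input format (one list of scored candidates per image, each flagged as a hit or a false positive), the assumption that every image contains a lesion, and the assignment of a density of exactly 25% to the high-density group are all assumptions.

```python
def split_by_density(cases, cutoff=25.0):
    """Split cases into (low, high) density groups by density index in percent.

    Assumption: each case is a dict with a "density" key; a value equal to
    the cutoff is placed in the high-density group.
    """
    low = [c for c in cases if c["density"] < cutoff]
    high = [c for c in cases if c["density"] >= cutoff]
    return low, high


def froc_points(images, thresholds):
    """Return (false positives per image, per-image sensitivity) pairs.

    images: one list per image of (score, is_hit) candidate tuples.
    A candidate is kept at a threshold when its score is >= that threshold.
    Sensitivity is the fraction of images with at least one kept hit.
    """
    if not images:
        raise ValueError("no images given")
    n = len(images)
    points = []
    for t in thresholds:
        hits = 0
        false_positives = 0
        for candidates in images:
            kept = [(s, h) for s, h in candidates if s >= t]
            if any(h for _, h in kept):
                hits += 1
            false_positives += sum(1 for _, h in kept if not h)
        points.append((false_positives / n, hits / n))
    return points
```

Calling `froc_points` separately on the images of the low- and high-density groups gives one curve per group, which is how the two curves in each FROC figure would be compared.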


[Figure 3 plot: "Detection Performance at Detected Stage"; curves for density<25% and density>25%; horizontal axis: False Positives Per Image]

Figure 3. FROC curves of CAD cancer detection on mammograms at screening detected stage.

[Figure 4 plot: "Detection Performance at Missed Stage"]

Figure 4. FROC curves of CAD cancer detection on mammograms at screening missed stage.

Objective 2: to design a new CAD system for improving missed cancer detection
Accomplishments: 

The new CAD system is based on our two generations of CAD algorithms for mass detection in
digitized mammograms [4][5] and incorporates the results of the missed cancer analysis in its
design. The strategies taken in this study are as follows.

(a) Multi-mode detection by breast density classification: The baseline testing study with the
existing CAD algorithm demonstrated that lesions occurring in dense breasts are more likely to
be missed in CAD detection. Therefore, to improve the detection of missed cancers, each
mammogram is first classified by the breast density index defined above, and an appropriate
detection mode is then applied. As explained above, because of the limited size of the database
in this study, each input mammogram was classified into one of two categories, corresponding to
density percentages of less or more than 25%.

(b) Breast area partition and region-based adaptive detection: Because the probability that a
cancer is missed in screening mammography varies greatly with its location in the mammogram,
breast area partition provides the basis for further adaptive processing. The partition process
consists of three steps: (i) breast boundary and nipple detection; (ii) pectoral muscle and view
(CC or MLO) identification; (iii) area partition. Figure 5 shows the likelihood of missed
cancers in each region.

(c) Weighted classification using the distinguishing features identified in the missed cancer
analysis: The classification is a modified hybrid structure in which (i) a combined "hard" and
"soft" decision classification strategy is applied [4][5]; (ii) decision thresholds are adjusted
based on the missed cancer feature analysis. For example, a significant difference in the
feature "mass size" was observed between the detected and missed stages, so the threshold for
this feature in the decision tree was reduced to increase the chance that a missed cancer is
detected; (iii) candidates in the candidate competition are weighted by the region likelihood
value.

Figure 6 and Figure 7 show the FROC curves of detection on mammograms at the missed and detected
cancer stages before and after improvement. It is observed that the new CAD system provides
better detection performance at both the missed and detected stages. However, because the new
CAD system is designed with a focus on missed cancer, a larger improvement is obtained for
missed cancer detection.
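Two of the strategies above, density-based mode selection and region-likelihood weighting of candidates, can be sketched in Python as follows. This is a hypothetical illustration, not the implementation of the CAD system in [4][5]. The mode names, the multiplicative form of the weighting, the default weight for unlisted regions, and the example likelihood values are all assumptions; the report does not specify how the region likelihood enters the candidate competition.

```python
def select_detection_mode(density_index, cutoff=25.0):
    """Choose a detection mode from the breast density index (percent).

    Assumption: a density exactly at the cutoff uses the high-density mode.
    """
    return "high_density" if density_index >= cutoff else "low_density"


def rank_candidates(candidates, region_likelihood, default_weight=1.0):
    """Rank candidates by score multiplied by their region's likelihood.

    candidates: list of dicts with "score" and "region" keys (hypothetical).
    region_likelihood: dict mapping a region label to a weight (hypothetical).
    Returns new dicts with an added "weighted_score", highest first.
    """
    weighted = []
    for cand in candidates:
        weight = region_likelihood.get(cand["region"], default_weight)
        weighted.append({**cand, "weighted_score": cand["score"] * weight})
    return sorted(weighted, key=lambda c: c["weighted_score"], reverse=True)


# Example with made-up likelihood values for two region labels.
example = rank_candidates(
    [{"score": 0.6, "region": "SA"}, {"score": 0.7, "region": "C"}],
    {"SA": 0.5, "C": 0.2},
)
```

In this example the candidate in the region with the higher assumed likelihood is ranked first even though its raw score is lower, which is the intended effect of weighting the competition by region.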


[Figure 5 plots: (a) CC View and (b) MLO View; horizontal axis: Location]

Figure 5. Distribution of cancers at different locations on (a) CC view and (b) MLO view, where SA=Subareolar,
C=Central, CL=Lower-Central, CU=Upper-Central, RC=Central-Retroglandular, RU=Upper-Retroglandular,
RL=Lower-Retroglandular, L=Lateral, CL=Central-Lateral, CM=Medial-Central, RM=Medial-Retroglandular,
RL=Lateral-Retroglandular.

KEY RESEARCH ACCOMPLISHMENTS


1. A comprehensive analysis was performed of the effect of breast density on cancer detection.
The accomplishments include breast dense tissue segmentation, correlation analysis of
mammogram density features between missed and detected stages, statistical testing of the
density difference between normal and cancerous mammograms, and a baseline study of the
effect of density on CAD detection performance using the existing algorithm.

2. A new CAD system was designed based on the existing second-generation CAD algorithm and
the missed cancer analysis. Owing to the modification strategies adopted in the new system,
detection performance was improved for mammograms at both the detected and missed stages.
With the focus on missed cancer analysis and detection, a larger improvement was obtained
in detecting missed cases, even though the overall detection performance at the missed
stage is still lower than that at the detected stage.


REPORTABLE  OUTCOMES 

1.  Presentation  and/or  proceedings  paper 

(a) Lihua Li, Zuobao Wu, Zhao Chen, Angela Salem, Maria Kallergi, Claudia G. Berman,
“Statistical Analysis of Missed Cancer Features in Screening Mammography,” Proceedings of
SPIE Medical Imaging, 2005.


CONCLUSIONS 

This project explores an innovative CAD strategy for improving early detection of breast
cancer in screening mammograms by focusing on computerized analysis and detection of cancers
missed by radiologists. The research in this second year covered (i) continuation of the missed
cancer analysis, with a focus on density analysis and its effect on CAD detection; (ii) design
of a new CAD system. Substantial progress has been made in the past year. The results
demonstrated the effectiveness of this study in improving detection performance.